# Transformer architecture
Sundial Base 128m
Apache-2.0
Sundial is a series of generative time series foundation models capable of zero-shot inference for both deterministic and probabilistic forecasting.
Climate Model
Safetensors
thuml
214
5
Ast Finetuned Audioset 10 10 0.4593 ONNX
This is the ONNX version of the AST (Audio Spectrogram Transformer) model, designed specifically for audio classification tasks and fine-tuned on the AudioSet dataset.
Audio Classification
Transformers

onnx-community
684
1
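A minimal sketch of running this ONNX export with onnxruntime. It assumes preprocessing can reuse the original AST feature extractor from the MIT/ast-finetuned-audioset-10-10-0.4593 repo and that the exported graph has already been downloaded; the file path and dummy waveform are placeholders.

```python
# Running the ONNX export of AST with onnxruntime.
# The ONNX file path is a placeholder; preprocessing reuses the original AST
# feature extractor from the MIT/ast-finetuned-audioset-10-10-0.4593 repo (an assumption).
import numpy as np
import onnxruntime as ort
from transformers import ASTFeatureExtractor

extractor = ASTFeatureExtractor.from_pretrained("MIT/ast-finetuned-audioset-10-10-0.4593")
session = ort.InferenceSession("model.onnx")  # placeholder path to the exported graph

waveform = np.zeros(16000, dtype=np.float32)  # 1 s of silence as dummy 16 kHz audio
inputs = extractor(waveform, sampling_rate=16000, return_tensors="np")

# Look up the graph's input name instead of hard-coding it.
input_name = session.get_inputs()[0].name
logits = session.run(None, {input_name: inputs["input_values"]})[0]
print("top class id:", logits.argmax(-1))
```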
Falcon E 3B Instruct
Other
Falcon-E-3B-Instruct is an efficient language model built on a 1.58-bit (ternary-weight) architecture and optimized for edge devices, offering fast inference with a low memory footprint.
Large Language Model
Transformers

tiiuae
225
22
Orpheus TTS MediaSpeech
An Arabic text-to-speech model trained on the MediaSpeech dataset; its specific uses and capabilities are not documented in detail.
Large Language Model
Transformers Arabic

kadirnar
21
2
Unt 8b
Apache-2.0
The Camel Model is a text generation model based on the transformer architecture, supporting Azerbaijani and trained using reinforcement learning.
Large Language Model
Transformers Other

omar07ibrahim
33
2
Bidi Eng Pol
A Transformer-based bidirectional machine translation model supporting translation in both directions between English and Polish.
Machine Translation
Transformers Supports Multiple Languages

allegro
185
1
Vit Large Patch14 Dinov2.lvd142m
Apache-2.0
A vision Transformer (ViT)-based image feature model, pre-trained on the LVD-142M dataset using the self-supervised DINOv2 method.
Image Classification
Transformers

pcuenq
18
0
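A minimal feature-extraction sketch for this checkpoint, assuming the weights are available under the timm model name `vit_large_patch14_dinov2.lvd142m` and that `timm` and `Pillow` are installed; the image path is a placeholder.

```python
# Image feature extraction with the timm ViT-L/14 DINOv2 checkpoint.
import timm
import torch
from PIL import Image

model = timm.create_model("vit_large_patch14_dinov2.lvd142m", pretrained=True, num_classes=0)
model.eval()

# Build the preprocessing pipeline that matches the pretrained weights.
cfg = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**cfg, is_training=False)

image = Image.open("example.jpg").convert("RGB")  # placeholder image
with torch.no_grad():
    features = model(transform(image).unsqueeze(0))  # (1, 1024) pooled embedding
print(features.shape)
```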
Vit Liveness Detection V1.0
Apache-2.0
A face liveness detection model built with the Transformers library, reporting strong results on its evaluation set.
Face-related
Transformers

nguyenkhoa
176
1
MOMENT 1 Base
MIT
MOMENT is a family of general-purpose foundation models for time series analysis, supporting tasks such as forecasting, classification, and anomaly detection, and usable out of the box or after fine-tuning.
Climate Model
Transformers

AutonLab
4,975
3
Speecht5 Finetuned Emirhan Tr
MIT
A Turkish text-to-speech model fine-tuned from Microsoft's SpeechT5, capable of generating high-quality Turkish speech (usage sketch below).
Speech Synthesis
TensorBoard Other

emirhanbilgic
22
1
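A minimal sketch of synthesizing Turkish speech with this fine-tune, following the standard SpeechT5 recipe in Transformers. The fine-tuned repo id and the random speaker embedding are assumptions/placeholders; the HiFi-GAN vocoder is the standard microsoft/speecht5_hifigan checkpoint.

```python
# Text-to-speech with a SpeechT5 fine-tune, using the standard Transformers recipe.
import torch
import soundfile as sf
from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech, SpeechT5HifiGan

repo_id = "emirhanbilgic/speecht5_finetuned_emirhan_tr"  # assumed repo id
processor = SpeechT5Processor.from_pretrained(repo_id)
model = SpeechT5ForTextToSpeech.from_pretrained(repo_id)
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

inputs = processor(text="Merhaba, nasılsınız?", return_tensors="pt")

# SpeechT5 conditions on a 512-dim x-vector; a random vector works as a placeholder,
# but a real speaker embedding gives much better voice quality.
speaker_embeddings = torch.randn(1, 512)

speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
sf.write("speech.wav", speech.numpy(), samplerate=16000)
```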
Swahili English Translation
MIT
A Transformer model specifically developed for bidirectional translation between Swahili and English, fine-tuned on 210,000 sentence pairs
Machine Translation
Transformers

Bildad
98
2
Birna Bert
A Transformer encoder model based on BERT architecture, specifically designed for generating RNA sequence embeddings
Text Embedding
Transformers

buetnlpbio
364
1
Dictalm2 It Qa Fine Tune
Apache-2.0
This is a fine-tuned version of Dicta-IL's dictalm2.0-instruct model, specifically designed for generating Hebrew question-answer pairs.
Question Answering System
Transformers Other

618AI
2,900
6
Real3d
MIT
Real3D is a 2D-to-3D mapping Transformer model based on the TripoSR architecture, extending its capability to process real-world images through unsupervised self-training and automatic data filtering.
3D Vision
hwjiang
22
19
Codontransformer
Apache-2.0
A codon optimization tool that converts protein sequences into DNA sequences optimized for the target organism.
Protein Model
Transformers

adibvafa
1,327
7
Medsam Breast Cancer
An image segmentation model built with the Transformers library, targeting breast-cancer segmentation in medical images.
Image Segmentation
Transformers Other

MichaelSoloveitchik
61
0
Segformer B3 Fashion
Other
A fashion-item image segmentation model based on the SegFormer architecture, designed to identify and segment clothing and accessories (usage sketch below).
Image Segmentation
Transformers

sayeed99
75.65k
21
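A minimal usage sketch for this checkpoint, assuming it follows the standard SegFormer API in Transformers; the repo id and image path are assumptions/placeholders.

```python
# Clothing segmentation with a SegFormer checkpoint.
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation

repo_id = "sayeed99/segformer-b3-fashion"  # assumed repo id
processor = SegformerImageProcessor.from_pretrained(repo_id)
model = SegformerForSemanticSegmentation.from_pretrained(repo_id)

image = Image.open("outfit.jpg").convert("RGB")  # placeholder image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # (1, num_labels, H/4, W/4)

# Upsample to the original resolution and take the per-pixel argmax.
upsampled = torch.nn.functional.interpolate(
    logits, size=image.size[::-1], mode="bilinear", align_corners=False
)
mask = upsampled.argmax(dim=1)[0]
print(mask.shape, mask.unique())
```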
Pllava 7b
Apache-2.0
PLLaVA is an open-source video-language chatbot obtained by fine-tuning a large image-language model on video instruction-following data; it is intended for research on multimodal large models and chatbots.
Text-to-Video
Transformers

ermu2001
109
13
Trocr Base Spanish
MIT
Base version of the TrOCR model for Spanish printed text, built on the Transformer architecture and fine-tuned on a custom dataset (usage sketch below).
Text Recognition
Transformers Supports Multiple Languages

qantev
170
5
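A minimal OCR sketch for this checkpoint, assuming it follows the standard TrOCR encoder-decoder API in Transformers; the repo id and the input image are assumptions/placeholders.

```python
# Printed-text OCR with a TrOCR checkpoint.
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

repo_id = "qantev/trocr-base-spanish"  # assumed repo id
processor = TrOCRProcessor.from_pretrained(repo_id)
model = VisionEncoderDecoderModel.from_pretrained(repo_id)

image = Image.open("line.png").convert("RGB")  # a single cropped text line (placeholder)
pixel_values = processor(images=image, return_tensors="pt").pixel_values

generated_ids = model.generate(pixel_values)
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
```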
Granite Timeseries Patchtst
Apache-2.0
PatchTST is a Transformer-based model for long-term time series forecasting that uses subsequence patching and channel independence to improve prediction accuracy (see the patching sketch below).
Climate Model
Transformers

ibm-granite
1,505
11
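A conceptual plain-PyTorch sketch of the subsequence patching and channel independence described above; it illustrates the input transformation only and is not the model's actual implementation.

```python
# PatchTST-style input preparation: each channel of a multivariate series is treated
# independently and split into overlapping subsequence patches, which become the
# "tokens" fed to the Transformer. Conceptual sketch only.
import torch

batch, seq_len, channels = 2, 512, 7   # example sizes
patch_len, stride = 16, 8              # typical patching hyperparameters

x = torch.randn(batch, seq_len, channels)

# Channel independence: treat every channel as its own univariate series.
x = x.permute(0, 2, 1).reshape(batch * channels, seq_len)

# Subsequence patching: unfold the series into overlapping windows of length patch_len.
patches = x.unfold(dimension=-1, size=patch_len, step=stride)
print(patches.shape)  # (batch * channels, num_patches, patch_len)

# Each patch is then linearly projected to d_model and processed by a Transformer encoder.
```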
Dpt Beit Large 512
MIT
A monocular depth estimation model built on a BEiT Transformer backbone, capable of inferring fine-grained depth from a single image (usage sketch below).
3D Vision
Transformers

Intel
2,794
8
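A minimal depth-estimation sketch, assuming the checkpoint is published as Intel/dpt-beit-large-512 and follows the standard DPT API in Transformers; the image path is a placeholder.

```python
# Monocular depth estimation with the DPT-BEiT checkpoint.
import torch
from PIL import Image
from transformers import DPTImageProcessor, DPTForDepthEstimation

processor = DPTImageProcessor.from_pretrained("Intel/dpt-beit-large-512")
model = DPTForDepthEstimation.from_pretrained("Intel/dpt-beit-large-512")

image = Image.open("room.jpg").convert("RGB")  # placeholder image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    depth = model(**inputs).predicted_depth  # (1, H', W') relative depth

# Resize the prediction back to the input resolution.
depth = torch.nn.functional.interpolate(
    depth.unsqueeze(1), size=image.size[::-1], mode="bicubic", align_corners=False
).squeeze()
print(depth.shape)
```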
Llm Jp 13b Instruct Full Jaster Dolly Oasst V1.0
Apache-2.0
A large-scale language model developed by the Japanese LLM-jp project, supporting text generation tasks in Japanese and English
Large Language Model
Transformers Supports Multiple Languages

llm-jp
750
8
Gpt2 Demo
Other
GPT-2 is a self-supervised pre-trained language model based on the Transformer architecture, which excels at text generation tasks.
Large Language Model
Transformers

demo-leaderboard
19.21k
1
Bge Base En V1.5 Ct2
MIT
BGE Base English v1.5 is a Transformer-based sentence embedding model for extracting sentence features and computing sentence similarity; this listing is a CTranslate2 (CT2) conversion of the original weights (embedding sketch below).
Text Embedding
Transformers English

winstxnhdw
30
0
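A minimal embedding sketch using the original BAAI/bge-base-en-v1.5 weights via sentence-transformers; the CTranslate2-converted copy listed above would instead be served with the ctranslate2 runtime for faster inference, which is omitted here.

```python
# Sentence embeddings and cosine similarity with BGE base en v1.5
# (original weights; the CT2 conversion needs the ctranslate2 runtime instead).
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

sentences = ["A cat sits on the mat.", "A kitten is resting on a rug."]
embeddings = model.encode(sentences, normalize_embeddings=True)

print(util.cos_sim(embeddings[0], embeddings[1]))
```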
Discogs Maest 10s Pw 129e
MAEST is a family of Transformer models based on PaSST, focused on music analysis applications and particularly strong at music genre classification.
Audio Classification
Transformers

mtg-upf
33
0
Dogs Breed Classification Using Vision Transformers
Openrail
A Vision Transformer-based image classification model for dog breed recognition, documented in English and released under an OpenRAIL license.
Image Classification
Transformers English

AmitMidday
27
1
Hubert Base Audioset
Audio representation model based on HuBERT architecture, pre-trained on the complete AudioSet dataset, suitable for general audio tasks
Audio Classification
Transformers

ALM
345
2
Dinov2 Large
Apache-2.0
A vision Transformer model trained using the DINOv2 method, extracting robust visual features from massive image data through self-supervised learning
Image Classification
Transformers

facebook
558.78k
79
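A minimal feature-extraction sketch for facebook/dinov2-large using the Transformers library; the image path is a placeholder.

```python
# Extracting image features with facebook/dinov2-large via Transformers.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

processor = AutoImageProcessor.from_pretrained("facebook/dinov2-large")
model = AutoModel.from_pretrained("facebook/dinov2-large")

image = Image.open("photo.jpg").convert("RGB")  # placeholder image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Use the [CLS] token as a global image descriptor.
cls_embedding = outputs.last_hidden_state[:, 0]
print(cls_embedding.shape)  # (1, 1024)
```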
Segformer B0 Finetuned Segments Sidewalk 2
A SegFormer semantic segmentation model fine-tuned on the Segments.ai sidewalk-semantic dataset, suitable for sidewalk scene analysis
Image Segmentation
Transformers

thesisabc
16
0
Trocr Base Printed Fr
MIT
A Transformer-based OCR model for French printed text, filling the gap left by the absence of an official French TrOCR checkpoint.
Image-to-Text
Transformers French

agomberto
110
2
Japanese Hubert Base
Apache-2.0
A Japanese HuBERT base model trained by rinna Co., Ltd. on approximately 19,000 hours of the Japanese speech corpus ReazonSpeech v1.
Speech Recognition
Transformers Japanese

rinna
4,550
68
Trocr Processor
TrOCR is a Transformer-based optical character recognition model specifically designed for handwritten text recognition, fine-tuned on the IAM handwritten database.
Image-to-Text
Transformers

anaghasavit
18
3
Plant Disease Classification2
An image classification model based on the transformers library for identifying and classifying plant diseases.
Image Classification
Transformers

ayerr
40
1
Trocr Base Ckb
An OCR system based on Transformer architecture, specifically designed for recognizing Central Kurdish text, trained using synthetic data.
Text Recognition
Transformers

razhan
19
0
Pythia 160m
Apache-2.0
Pythia-160M is the 160M-parameter member of EleutherAI's Pythia suite, a family of Transformer language models developed for interpretability research and trained on the Pile dataset (generation sketch below).
Large Language Model
Transformers English

EleutherAI
163.75k
31
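A minimal generation sketch for EleutherAI/pythia-160m with the Transformers library; the prompt and sampling settings are illustrative only.

```python
# Text generation with Pythia-160M.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")

inputs = tokenizer("The Pile is a dataset that", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```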
BLEURT 20 D12
A PyTorch implementation of the BLEURT model, used for automatic evaluation of generated text in natural language processing.
Large Language Model
Transformers

lucadiliello
2.6M
1
Segformer Finetuned Segments Cmp Facade
MIT
A building facade semantic segmentation model based on SegFormer architecture, capable of recognizing 12 types of architectural elements
Image Segmentation
Transformers English

Xpitfire
379
1
Oneformer Ade20k Swin Tiny
MIT
The first multi-task universal image segmentation framework, supporting semantic, instance, and panoptic segmentation with a single model (usage sketch below).
Image Segmentation
Transformers

shi-labs
12.96k
16
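A minimal sketch of semantic segmentation with shi-labs/oneformer_ade20k_swin_tiny using the Transformers OneFormer API; switching the task token to "instance" or "panoptic" selects the other modes. The image path is a placeholder.

```python
# Universal segmentation with OneFormer: one checkpoint, task selected by a task token.
import torch
from PIL import Image
from transformers import OneFormerProcessor, OneFormerForUniversalSegmentation

repo_id = "shi-labs/oneformer_ade20k_swin_tiny"
processor = OneFormerProcessor.from_pretrained(repo_id)
model = OneFormerForUniversalSegmentation.from_pretrained(repo_id)

image = Image.open("street.jpg").convert("RGB")  # placeholder image
inputs = processor(images=image, task_inputs=["semantic"], return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Post-process into a per-pixel class map at the original resolution.
semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(semantic_map.shape)
```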
Scinertopic
MIT
A scientific term recognition model based on SciBERT, supporting NER-enhanced topic modeling
Sequence Labeling
Transformers

RJuro
71
7
Gpt2 Small
MIT
GPT-2 is an autoregressive language model based on the Transformer architecture. It is pre-trained on a large-scale English corpus through self-supervised learning and excels at text generation tasks.
Large Language Model
Transformers English

ComCom
1,032
3